SEQUENCE LISTING

(1) GENERAL INFORMATION: 

(i) APPLICANT: DOUGLAS SMITH 
(ii) TITLE OF INVENTION: NUCLEIC ACID AND AMINO ACID SEQUENCES 

RELATING TO HELICOBACTER PYLORI FOR 
DIAGNOSTICS AND THERAPEUTICS 
(iii) NUMBER OF SEQUENCES: 941 
(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: LAHIVE & COCKFIELD, LLP 

(B) STREET: 28 State Street 

(C) CITY: Boston 

(D) STATE: Massachusetts 

(E) COUNTRY: USA 

(F) ZIP: 02109
(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25
(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER: US/08/487,032B

(B) FILING DATE: 07-Jun-1995

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Mandragouras, Amy E.

(B) REGISTRATION NUMBER: 36,207

(C) REFERENCE/DOCKET NUMBER: GTN-001
(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (617)227-7400 

(B) TELEFAX: (617)227-5941 



RECEIVED

DEC 1 « 2» W 



Does Not Comply
Corrected Diskette Needed



ERRORED SEQUENCES 

18782 (2) INFORMATION FOR SEQ ID NO: 458: 



18784 (i) SEQUENCE CHARACTERISTICS: 

18785 (A) LENGTH: 122 amino acids

18786 (B) TYPE: amino acid

18787 (D) TOPOLOGY: linear
18789 (ii) MOLECULE TYPE: protein
18791 (iii) HYPOTHETICAL: YES 

18793 (vi) ORIGINAL SOURCE: 

18794 (A) ORGANISM: Helicobacter pylori 
18796 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 458: 
E-->

18798
19508
19520
19522
19523
19525
19526
19544
19546
19547
19549
(2) INFORMATION FOR SEQ ID NO: 473:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 203 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: YES
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Helicobacter pylori
(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1...203

(D) OTHER INFORMATION: /note= "flagellar biosynthesis protein flha"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 473:
Input Set : D:\Seqlistcorr2.txt

E--> 19550 150 155 160
19552 Glu Pro Asn Leu Arg Lys Ala Leu Ser Lys Gln Met Glu Gln Ala Arg
19553 165 170 175
19555 Asn Asp Gly Leu Val Leu Ser His Ala Glu Leu Asp Pro Asn Ser Asn
19556 180 185 190
19558 Phe Glu Ala Leu Gly Thr Ile His Ile Asn Phe
19559 195 200

20574 (2) INFORMATION FOR SEQ ID NO: 490:
20576 (i) SEQUENCE CHARACTERISTICS:
20577 (A) LENGTH: 166 amino acids
20578 (B) TYPE: amino acid
20579 (D) TOPOLOGY: linear
20581 (ii) MOLECULE TYPE: protein
20583 (iii) HYPOTHETICAL: YES
20585 (vi) ORIGINAL SOURCE:
20586 (A) ORGANISM: Helicobacter pylori
20588 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 490:
20590 Val Gly Val Ile Tyr Ile Val Thr Thr Asn Thr Leu Asn Ile Leu Ser
20591 1 5 10 15
20593 Cys Glu Ser Phe Glu Ile Leu Glu Lys Arg Glu Leu Asp Thr Ser Gly
20594 20 25 30
20596 Val Thr Lys Thr Ser Thr Pro Phe Phe Ser Arg Val Glu Gly Ile Asp
20597 35 40 45
20599 Ala Gly Thr Leu Gly Lys Leu Phe Ser Gly Ser Gln Ser Lys Asn Tyr
20600 50 55 60
20602 Phe Ala Tyr Tyr Asp Ala Leu Val Lys Lys Glu Lys Arg Lys Glu Val
20603 65 70 75 80
20605 Arg Ile Glu Lys Lys Glu Glu Arg Ile Asp Ala Arg Glu Asn Lys Arg
20606 85 90 95
20608 Glu Ile Lys Gln Glu Ala Ile Lys Glu Pro Lys Lys Ala Asn Gln Gly
20609 100 105 110
W--> 20611 Thr Glu Asn Ala Pro Thr Leu Glu Glu Lys Xaa Tyr Gln Xaa Ala Glu
20612 115 120 125
W--> 20614 Arg Lys Phe Asp Ala Lys Xaa Xaa Arg Asp Arg Ser Xaa Asp Glu Xaa
20615 130 135 140
W--> 20617 Lys Lys Thr Xaa Pro Pro Lys Xaa Leu Trp Asn Leu Lys Lys Glu Lys
20618 145 150 155 160
W--> 20620 Lys Ser Met Xaa Lys Glu Xaa Glu Lys Glu Thr Glu Glu Arg Arg Lys
E--> 20621 165 170
20623 Ala Leu Glu Met Asp Lys Glu Asn Glu Lys Val Asn Ala Lys Glu Asn
20624 180 185 190
20626 Glu Arg Glu Ile Asn Gln Glu Ser Ala Asn Glu Pro Ser Ser Glu Asn
20627 195 200 205
20629 Thr Pro Leu
E--> 20630 210

20810 (2) INFORMATION FOR SEQ ID NO: 494:
20812 (i) SEQUENCE CHARACTERISTICS:
20813 (A) LENGTH: 78 amino acids
20814 (B) TYPE: amino acid
20815 (D) TOPOLOGY: linear

20817 (ii) MOLECULE TYPE: protein

20819 (iii) HYPOTHETICAL: YES

20821 (vi) ORIGINAL SOURCE:

20822 (A) ORGANISM: Helicobacter pylori
20824 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 494:

W--> 20826 Met Glu Asn Pro Ser Leu Gly Ser Asn Pro Leu Xaa Gln Lys Ala Met

20827 1 5 10 15

E--> 20829 Lys Asn Lys Xaa Ile Ser Lys Ser Leu Pro Tyr Try Arg Lys Met Pro

20830 20 25 30

W--> 20832 Asn Gly Ala Glu Val Tyr Gly Val Xaa Ile Leu Leu Pro Leu Phe Lys

20833 35 40 45

W--> 20835 Glu Asn Thr Xaa Xaa Trp Trp Gly Val Leu Met Ile Phe Phe Ile Xaa

20836 50 55 60

W--> 20838 Xaa Xaa Val Met Lys Ser Leu Lys Thr Gly Ala Ile Tyr Phe

20839 65 70 75

20996 (2) INFORMATION FOR SEQ ID NO: 497:

20998 (i) SEQUENCE CHARACTERISTICS:

20999 (A) LENGTH: 127 amino acids

21000 (B) TYPE: amino acid

21001 (D) TOPOLOGY: linear
21003 (ii) MOLECULE TYPE: protein
21005 (iii) HYPOTHETICAL: YES

21007 (vi) ORIGINAL SOURCE:

21008 (A) ORGANISM: Helicobacter pylori
21010 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 497:

21012 Met Pro Val Ile Ile Gly Tyr Ile Cys Thr Gly Thr Val Leu Ala Ala

21013 1 5 10 15

21015 Phe Phe Lys Ile Asn Asp Phe Asn Leu Leu Ser Asp Ile Gly Glu Phe

21016 20 25 30

21018 Gly Ile Val Phe Leu Met Phe Lys Lys Gly Ile Glu Phe Asn Phe Asp

21019 35 40 45

21021 Lys Leu Lys Ser Ile Lys Gln Glu Val Leu Val Phe Gly Leu Leu Gln

21022 50 55 60

21024 Val Val Leu Cys Ala Leu Ile Ala Phe Leu Leu Gly Tyr Phe Val Leu

21025 65 70 75 80

21027 Gly Leu Ser Pro Phe Phe Pro Leu Phe Leu Ala Trp Gly Phe His Ser

21028 85 90 95
E--> 21030 Leu Gln Pro

22260 (2) INFORMATION FOR SEQ ID NO: 522: 

22262 (i) SEQUENCE CHARACTERISTICS: 

22263 (A) LENGTH: 107 amino acids 

22264 (B) TYPE: amino acid 

22265 (D) TOPOLOGY: linear 
22267 (ii) MOLECULE TYPE: protein 
22269 (iii) HYPOTHETICAL: YES 

22271 (vi) ORIGINAL SOURCE: 

22272 (A) ORGANISM: Helicobacter pylori 
22274 (ix) FEATURE: 
22275 (A) NAME/KEY: misc_feature

22276 (B) LOCATION: 1...107 

22277 (D) OTHER INFORMATION: /note= "L-lactate permease" 
22279 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 522: 

22281 Val Phe Leu Thr Gly Ser Asp Thr Ser Ser Asn Leu Leu Phe Gly Ser

22282 1 5 10 15

22284 Leu Gln Met Val Ile Ala Thr Gln Leu Gly Leu Pro Glu Val Leu Phe

22285 20 25 30

22287 Leu Ala Ala Asn Thr Ser Gly Gly Val Val Gly Lys Met Ile Ser Pro

22288 35 40 45

E--> 22290 Gln Ser Ile Ala Ile Als Cys Ala Ala Val Gly Leu Val Gly Lys Glu
22291 50 55 60

22293 Ser Glu Met Phe Arg Phe Thr Val Lys Tyr Ser Ile Ala Leu Ala Ile

22294 65 70 75 80

22296 Ile Met Gly Ile Val Leu His Ser Tyr Arg Leu Cys Phe Leu Leu Tyr

22297 85 90 95

22299 Tyr Ser Ser Tyr Ser Tyr Leu Met Glu Gly Val

22300 100 105
23164 (2) INFORMATION FOR SEQ ID NO: 540:

23166 (i) SEQUENCE CHARACTERISTICS:

23167 (A) LENGTH: 148 amino acids

23168 (B) TYPE: amino acid

23169 (D) TOPOLOGY: linear
23171 (ii) MOLECULE TYPE: protein
23173 (iii) HYPOTHETICAL: YES

23175 (vi) ORIGINAL SOURCE:

23176 (A) ORGANISM: Helicobacter pylori
23178 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 540:

23180 Met Glu Val Met Asp Glu Ala Leu Leu Arg Ser Lys Arg Phe Asp Arg

23181 1 5 10 15

23183 Arg Ile Phe Ile Ser Leu Pro Asp Leu Leu Glu Arg Gln Ser Ile Leu

23184 20 25 30

23186 Glu Lys Leu Leu Glu Asn Lys Lys His Ala Leu Asp Tyr Leu Lys Ile

23187 35 40 45

23189 Ala Lys Ile Cys Val Gly Phe Ser Gly Ala Met Leu Ala Thr Leu Ile

23190 50 55 60

E--> 23192 Aln Glu Ser Ala Leu Asn Ala Leu Lys His Gln Arg Lys Glu Ile Thr
23193 65 70 75 80

23195 His Gly Asp Ile Leu Glu Val Lys Asp Lys Ile Ala Tyr Gly Lys Lys

23196 85 90 95

23198 Lys Pro Gln Thr Leu Asp Glu Asn Gln Lys Glu Leu Val Ala Leu Tyr

23199 100 105 110

23201 Gln Ser Ala Lys Ala Leu Ser Ala Tyr Trp Leu Glu Ile Glu Phe Asp

23202 115 120 125

23204 Lys Ala Ser Leu Leu Gly Glu Phe Ile Ala Phe Asn Glu Asn Lys Ile

23205 130 135 140
23207 His Ala Arg Ala Arg

E--> 23208 145

24988 (2) INFORMATION FOR SEQ ID NO: 578: 
24990 (i) SEQUENCE CHARACTERISTICS:

24991 (A) LENGTH: 90 amino acids

24992 (B) TYPE: amino acid

24993 (D) TOPOLOGY: linear
24995 (ii) MOLECULE TYPE: protein
24997 (iii) HYPOTHETICAL: YES

24999 (vi) ORIGINAL SOURCE:

25000 (A) ORGANISM: Helicobacter pylori
25002 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 578:
(2) INFORMATION FOR SEQ ID NO: 587:

25497 (i) SEQUENCE CHARACTERISTICS:

25498 (A) LENGTH: 182 amino acids

25499 (B) TYPE: amino acid

25500 (D) TOPOLOGY: linear
25502 (ii) MOLECULE TYPE: protein
25504 (iii) HYPOTHETICAL: YES

25506 (vi) ORIGINAL SOURCE:

25507 (A) ORGANISM: Helicobacter pylori

25509 (ix) FEATURE:

25510 (A) NAME/KEY: misc_feature

25511 (B) LOCATION: 1...182

25512 (D) OTHER INFORMATION: /note= "cation efflux system

25513 proteins"
25515 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 587:

E--> 25517 Val Ile Val Gly Ala (le /Leu Val Leu Phe Phe Gly Thr Thr Ser Phe
25518 1 5 10 15
25520 Ile Asn Thr Pro Val Asp Ala Phe Xaa Asp Ile Ser Pro Thr Gln Val
25521 20 25 30
25523 Lys Ile Ile Leu Lys Leu Pro Gly Ser Ser Pro Glu Glu Met Glu Asn
25524 35 40 45
25526 Asn Ile Ala Arg Pro Leu Glu Leu Glu Leu Leu Gly Leu Lys Gly Gln
25527 50 55 60
25529 Lys Ser Leu Arg Ser Ile Ser Lys Tyr Ser Ile Ser Asp Ile Thr Ile
25530 65 70 75 80
25532 Asp Phe Asp Asp Ser Val Asp Ile Tyr Leu Ala Arg Asn Ile Val Asn
25533 85 90 95
25535 Glu Arg Leu Ser Ser Val Met Lys Asp Leu Pro Val Gly Val Glu Gly

file://C:\CRF4\Outhold\VsrH487032B.htm 11/27/02

Page 7 of 16

RAW SEQUENCE LISTING DATE: 11/27/2002

PATENT APPLICATION: US/08/487,032B TIME: 09:12:45

Input Set : D:\Seqlistcorr2.txt

Output Set: N:\CRF4\11272002\H487032B.raw



25536 100 105 110
25538 Gly Met Ala Pro Ile Val Thr Pro Leu Ser Asp Ile Phe Met Phe Thr
25539 115 120 125
25541 Ile Asp Gly Asn Ile Thr Glu Ile Glu Lys Arg Gln Leu Leu Asp Phe
25542 130 135 140
25544 Val Ile Arg Pro Gln Leu Arg Met Ile Ser Gly Val Ala Asp Val Asn
25545 145 150 155 160
25547 Ser Ile Gly Gly Phe Ser Arg Ala Phe Val Ile Val Pro Asp Phe Asn
25548 165 170 175
25550 Asp Met Ala Arg Leu Gly
25551 180
31199 (2) INFORMATION FOR SEQ ID NO: 690:
31201 (i) SEQUENCE CHARACTERISTICS:
31202 (A) LENGTH: 144 amino acids
31203 (B) TYPE: amino acid
31204 (D) TOPOLOGY: linear
31206 (ii) MOLECULE TYPE: protein
31208 (iii) HYPOTHETICAL: YES
31210 (vi) ORIGINAL SOURCE:
31211 (A) ORGANISM: Helicobacter pylori
31213 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 690:
31215 Met Lys Asn Leu Arg His Phe Arg Lys Leu Ile Ala Phe Leu Gly Phe
31216 1 5 10 15
31218 Ser Pro Leu Leu Leu Gln Ala Asp Met Thr Thr Phe Phe Asn Ser Ile
31219 20 25 30
31221 Glu Gln Gln Leu Thr Ser Pro Thr Ala Lys Gly Ile Leu Met Val Ile
31222 35 40 45
31224 Phe Leu Gly Leu Ala Ile Phe Ile Trp Lys Asn Leu Asp Arg Trp Lys
31225 50 55 60
31227 Glu Ile Leu Met Thr Val Leu Ala Leu Lys Glu Val Pro Met Gln Tyr
31228 65 70 75 80
31230 Phe Ile Pro Ala Ser Asn Leu Lys Glu Ile Ser Ser Lys Glu Lys Phe
31231 85 90 95
E--> 31233 Leu [ Trp \Leu Asn Ala Lys Ser S Leu Leu Ser Gly Phe Val Pro Phe
31234 100 105 110
31236 Ile Met Ile Pro Trp Leu Asp Ile Leu Asn Ser Phe Val Leu Tyr Val
31237 115 120 125
31239 Cys Phe Leu Leu Ile Phe Ser Ile Ala Glu Phe Phe Asp Glu Asp Ile
31240 130 135 140
32796 (2) INFORMATION FOR SEQ ID NO: 723:

32798 (i) SEQUENCE CHARACTERISTICS: 

32799 (A) LENGTH: 174 amino acids 

32800 (B) TYPE: amino acid 

32801 (D) TOPOLOGY: linear 
32803 (ii) MOLECULE TYPE: protein 
32805 (iii) HYPOTHETICAL: YES 

32807 (vi) ORIGINAL SOURCE: 

32808 (A) ORGANISM: Helicobacter pylori 
32810 (ix) FEATURE: 
32811 (A) NAME/KEY: misc_feature

32812 (B) LOCATION: 1...174

32813 (D) OTHER INFORMATION: /note= "POTASSIUM/COPPER-TRANSPORTING

32814 ATPASE A"
32816 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 723:

E--> 32818 Met Arg Cys Glu Leu Trp Arg Arg Tyr Gly Gln Htr His Ala Lys Glu

32819 1 5 10 15

32821 Phe gly Pro Tyr Arg Tyr Leu Lys Leu Val Gly Ala Ser Gly Val Gly

32822 20 25 30

32824 Arg Phe Phe Ile Lys Gly Ala Phe Tyr Gly Leu Lys Asn Gly Val Leu
32825 35 40 45

32827 Gly Met Asp Leu Ser Val Ser Phe Gly Ala Leu Ser Ala Phe Val Tyr

32828 50 55 60

32830 Ser Leu Tyr Ala Met Leu Val Ser Gln Glu Thr Tyr Phe Gly Ala Ser

32831 65 70 75 80

E--> 32833 Ser Thr Ile Leu Thr Leu Val Phe Gly Ser Lys Phe Leu Glu Ley Lys

32834 85 90 95

32836 Ala Arg Leu Phe Ala Asn Glu Lys Cys leu Ala Leu Glu Ser His Glu

32837 100 105 110

32839 Ile His Ser Val Ile Val Val Glu Asn Gly Lys Gln Ile Glu Lys His

32840 115 120 125

32842 Pro Lys Asp Val Ala Ile Gly Ser Val Val Trp Val Pro Ser Gly Ala

32843 130 135 140

32845 Lys Ile Ala leu Asp Gly Val Leu Leu Lys Ser Ala Ser Val Asp Ala

32846 145 150 155 160

32848 Ser Leu Ile Ser Gly Glu Phe Lys Pro Leu Glu Ile Gly Gly

32849 165 170
32974 (2) INFORMATION FOR SEQ ID NO: 727: 

32976 (i) SEQUENCE CHARACTERISTICS: 

32977 (A) LENGTH: 77 amino acids 

32978 (B) TYPE: amino acid 

32979 (D) TOPOLOGY: linear 
32981 (ii) MOLECULE TYPE: protein
32983 (iii) HYPOTHETICAL: YES

32985 (vi) ORIGINAL SOURCE:

32986 (A) ORGANISM: Helicobacter pylori

32988 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 727:

32990 Val His Ser Lys Glu Arg Lys Glu Phe Leu Val Ser Glu Phe Lys Ala

32991 1 5 10 15

32993 Ser Ala Val Glu Met Glu Gly Ala Ser Val Ala Phe Val Cys Gln Lys
E--> 32994 20 25 30

32996 Phe Gly Val Pro Cys Cys Val Leu Arg Ser Ile Ser Asp Asn Ala Asp
E--> 32997 35 40 45

32999 Glu Lys Ala Gly Met Ser Phe Asp Glu Phe Leu Glu Lys Ser Ala His
E--> 33000 50 55 60

33002 Thr Ser Ala Lys Phe Leu Lys Ser Met Val Asp Glu Leu
E--> 33003 65 70 75

37478 (2) INFORMATION FOR SEQ ID NO: 821: 

37480 (i) SEQUENCE CHARACTERISTICS:
37481 (A) LENGTH: 84 amino acids 

37482 (B) TYPE: amino acid 

37483 (D) TOPOLOGY: linear 
37485 (ii) MOLECULE TYPE: protein 
37487 (iii) HYPOTHETICAL: YES 
37489 (vi) ORIGINAL SOURCE: 

37490 (A) ORGANISM: Helicobacter pylori

37492 (ix) FEATURE:

37493 (A) NAME/KEY: misc_feature

37494 (B) LOCATION: 1...84

37495 (D) OTHER INFORMATION: /note= "Plasmodium falciparum gametocyte

37496 specific antigen"
E--> 37498 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 824:

37500 Val Leu Val Val Gly Lys Pro Asn Glu Ser Tyr Ala Asp Thr His Ala

37501 1 5 10 15

37503 Arg Ile Glu His Phe Ile Lys Leu Val Asp Phe Lys Gly Glu Ile Val

37504 20 25 30

37506 Phe Ile Asn Glu Asp Asn Ser Ser Val Glu Ala Tyr Glu Asn Leu Glu

37507 35 40 45

37509 His Leu Gly Lys Lys Asn Lys Arg Ile Ala Thr Lys Asp Gly Arg Leu

37510 50 55 60

37512 Asp Ser Leu Ser Ala Cys Arg Ile Leu Glu Arg Tyr Cys Gln Gln Val

37513 65 70 75 80
37515 Leu Lys Lys Gly

38831 (2) INFORMATION FOR SEQ ID NO: 849: 

38833 (i) SEQUENCE CHARACTERISTICS: 

38834 (A) LENGTH: 46 amino acids

38835 (B) TYPE: amino acid 

38836 (D) TOPOLOGY: linear 
38838 (ii) MOLECULE TYPE: protein 
38840 (iii) HYPOTHETICAL: YES 

38842 (vi) ORIGINAL SOURCE: 

38843 (A) ORGANISM: Helicobacter pylori 
38845 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 849: 

38847 Val Ile Val Gly Val Gly Lys Ser Ala Leu Val Ala Gln Lys Ile Val

38848 1 5 10 15

38850 Ala Ser Met Leu Ser Thr Gly Asn Arg Ser Ala Phe Leu His Pro Thr

38851 20 25 30
E--> 38853 Glu Ala Met His Gly Asp Leu Gly Met Gly Lys Lys Thr Mer

38854 35 40 45

39047 (2) INFORMATION FOR SEQ ID NO: 854: 

39049 (i) SEQUENCE CHARACTERISTICS: 

39050 (A) LENGTH: 153 amino acids 

39051 (B) TYPE: amino acid 

39052 (D) TOPOLOGY: linear 
39054 (ii) MOLECULE TYPE: protein 
39056 (iii) HYPOTHETICAL: YES 

39058 (vi) ORIGINAL SOURCE: 

39059 (A) ORGANISM: Helicobacter pylori 
39061 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 854:
39063 Met Asp Ala Leu Glu Ile Thr Gln Lys Leu Ile Ser Tyr Pro Thr Ile
39064 1 5 10 15
39066 Thr Pro Lys Glu Cys Gly Ile Phe Glu Tyr Ile Lys Ser Leu Phe Pro
39067 20 25 30
39069 Ala Phe Lys Thr Leu Glu Cys Glu Lys Asn Gly Val Lys Asn Leu Phe
39070 35 40 45
39072 Leu Tyr Arg Ile Phe Asn Pro Leu Lys Lys His Ala Glu Lys Glu His
39073 50 55 60
39075 Ala Lys Glu Lys His Val Lys Glu Asn Val Xaa Pro Leu His Phe Cys
39076 65 70 75 80
E--> 39078 Leu Gln Gly Ile Leu Unk Ser Cys Leu Leu Gly Xaa Xaa Ala Xaa Asp
39079 85 90 95
39081 Ser Phe Xaa Xaa Ile Ile Lys Glu Gly Phe Leu Tyr Gly Arg Gly Ala
39082 100 105 110
39084 Gln Asp Met Lys Gly Gly Val Gly Xaa Phe Leu Gly Ala Xaa Xaa Asn
39085 115 120 125
39087 Phe Asn Xaa Lys Xaa Xaa Phe Xaa Phe Leu Phe Tyr Leu Thr Ser Asp
39088 130 135 140
39090 Glu Glu Gly Thr Arg Xaa Phe Xaa His
39091 145 150

43542 (2) INFORMATION FOR SEQ ID NO: 941:
43544 (i) SEQUENCE CHARACTERISTICS:
43545 (A) LENGTH: 55 amino acids
43546 (B) TYPE: amino acid
43547 (D) TOPOLOGY: linear
43549 (ii) MOLECULE TYPE: protein
43551 (iii) HYPOTHETICAL: YES
43553 (vi) ORIGINAL SOURCE:
43554 (A) ORGANISM: Helicobacter pylori
43556 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 941:
43558 Met Trp Leu Ser Glu His Phe Ala Ala Lys Gly Gly Asn Pro Leu Phe
43559 1 5 10 15
43561 Ala Pro Tyr Tyr Leu Gln Asp Asn Pro Thr Glu His Ile Val Thr Leu
43562 20 25 30
43564 Met Lys Asp Ile Thr Ser Ala Leu Gly Met Leu Ser Asn Ser Asn Leu
43565 35 40 45
43567 Lys Asn Asn Ser Thr Asp Val Leu Gln Leu Asn Thr Tyr Thr Gln Gln
43568 50 55 60
43570 Met Ser Arg Leu Ala Lys Leu Ser Asn Phe Ala Ser Phe Asp Ser Thr
43571 65 70 75 80
43573 Asp Phe Ser Glu Arg Leu Ser Ser Leu Lys Asn Gln Arg Phe Ala Asp
43574 85 90 95
43576 Ala Val Pro Asn Ala Met Asp Val Ile Leu Lys Tyr Ser Gln Arg Asp
43577 100 105 110
43579 Lys Leu Lys Asn Asn Leu Trp Ala Thr Gly Val Gly Gly Val Ser Phe
43580 115 120 125
43582 Val Glu Asn Gly Thr Gly Thr Leu Tyr Gly Val Asn Val Gly Tyr Asp
43583 130 135 140
43585 Arg Phe Val Arg Gly Val Ile Val Gly Gly Tyr Ala Ala Tyr Gly Tyr

43586 145 150 155 160

43588 Ser Gly Phe Tyr Glu Arg Ile Thr Ser Ser Lys Ser Asp Asn Val Asp

43589 165 170 175

43591 Val Gly Met Tyr Ala Arg Ala Phe Ile Lys Lys Ser Glu Leu Thr Phe

43592 180 185 190
E--> 43594 Arg Arg
RAW SEQUENCE LISTING ERROR SUMMARY DATE: 11/27/2002

PATENT APPLICATION: US/08/487,032B TIME: 09:12:48

Input Set : D:\Seqlistcorr2.txt 

Output Set: N:\CRF4\11272002\H487032B.raw 

Invalid Line Length: 

The rules require that a line not exceed 72 characters in length. This includes spaces. 

Seq#:666; Line(s) 30012 
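
The line-length rule quoted above can be sketched in a few lines of Python. This is only an illustration of the stated 72-character limit (spaces included), not the checker that produced this report; the function name `find_long_lines` and the constant `MAX_LINE_LENGTH` are hypothetical.

```python
MAX_LINE_LENGTH = 72  # limit quoted in the error summary; spaces count


def find_long_lines(lines, limit=MAX_LINE_LENGTH):
    """Return (line_number, length) pairs for lines longer than `limit`.

    Line numbers are 1-based. Trailing newline characters are not
    counted, but every other character, including spaces, is.
    """
    offenders = []
    for number, line in enumerate(lines, start=1):
        length = len(line.rstrip("\r\n"))
        if length > limit:
            offenders.append((number, length))
    return offenders
```

Under these assumptions, running `find_long_lines` over the lines of an input file would list offending line numbers in the same spirit as the `Line(s)` entry above.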
VERIFICATION SUMMARY DATE: 11/27/2002

PATENT APPLICATION: US/08/487,032B TIME: 09:12:48

Input Set : D:\Seqlistcorr2.txt

Output Set: N:\CRF4\11272002\H487032B.raw

L:28 M:220 C: Keyword misspelled or invalid format, [(A) APPLICATION NUMBER:]
L:29 M:220 C: Keyword misspelled or invalid format, [(B) FILING DATE:]
L:280 M:111 C: (47) String data converted to upper case,
M:111 Repeated in SeqNo=6
L:1844 M:220 C: Keyword misspelled or invalid format, [(2) INFORMATION FOR SEQ ID NO:]
L:2860 M:111 C: (47) String data converted to upper case,
M:111 Repeated in SeqNo=74
L:6118 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:6191 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:6390 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:6666 M:111 C: (47) String data converted to upper case,
L:7487 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:8652 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:8802 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:9469 M:111 C: (47) String data converted to upper case,
L:10267 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:13814 M:220 C: Keyword misspelled or invalid format, [(2) INFORMATION FOR SEQ ID NO:]
L:13989 M:111 C: (47) String data converted to upper case,
L:15383 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:387 after pos.:0
L:15386 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:387 after pos.:16
L:15488 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:389 after pos.:0
L:15588 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:391 after pos.:48
L:15614 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:392 after pos.:16
L:15647 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:392 after pos.:192
L:15862 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:395 after pos.:64
L:16101 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:399 after pos.:240
L:16110 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:399 after pos.:288
L:16286 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:402 after pos.:256
L:16330 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:403 after pos.:128
L:16336 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:403 after pos.:160
L:16532 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:407 after pos.:0
L:16767 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:412 after pos.:48
L:16770 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:412 after pos.:64
L:17183 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:421 after pos.:80
L:17212 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:422 after pos.:32
L:17268 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:423 after pos.:128
L:17431 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:427 after pos.:192
L:17608 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:432 after pos.:48
L:17640 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:433 after pos.:32
L:17646 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:433 after pos.:64
L:17788 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:437 after pos.:0
L:18347 M:220 C: Keyword misspelled or invalid format, [(2) INFORMATION FOR SEQ ID NO:]
L:18392 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:449 after pos.:48
L:18659 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:454 after pos.:112
L:18668 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:454 after pos.:160
L:18807 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:458 after pos.:48
L:18817 M:332 E: (32) Invalid/Missing Amino Acid Numbering, SEQ ID:458
L:18994 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:461 after pos.:160
L:19254 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:467 after pos.:0
L:19550 M:332 E: (32) Invalid/Missing Amino Acid Numbering, SEQ ID:473
L:20611 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:490 after pos.:112
L:20614 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:490 after pos.:128
L:20617 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:490 after pos.:144
L:20620 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:490 after pos.:160
L:20621 M:332 E: (32) Invalid/Missing Amino Acid Numbering, SEQ ID:490
L:20630 M:203 E: No. of Seq. differs, LENGTH: Input:166 Found:211 SEQ:490
L:20826 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:494 after pos.:0
L:20829 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:494 after pos.:16
L:20829 M:330 E: (2) Invalid Amino Acid Designator, 1
L:20832 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:494 after pos.:32
L:20835 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:494 after pos.:48
L:20838 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:494 after pos.:64
L:21030 M:203 E: No. of Seq. differs, LENGTH: Input:127 Found:99 SEQ:497
L:21122 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:499 after pos.:0
L:21128 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:499 after pos.:32
L:21131 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:499 after pos.:48
L:21134 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:499 after pos.:64
L:21276 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:502 after pos.:160
L:21279 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:502 after pos.:176
L:21553 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:508 after pos.:16
L:21593 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:509 after pos.:32
L:21608 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:509 after pos.:112
L:22167 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:519 after pos.:64
L:22173 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:519 after pos.:96
L:22290 M:330 E: (2) Invalid Amino Acid Designator, 1
L:22320 M:220 C: Keyword misspelled or invalid format, [(2) INFORMATION FOR SEQ ID NO:]
L:22333 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:523 after pos.:48
L:22436 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:23192 M:330 E: (2) Invalid Amino Acid Designator, 1
L:23208 M:203 E: No. of Seq. differs, LENGTH: Input:148 Found:149 SEQ:540
L:23569 M:220 C: Keyword misspelled or invalid format, [(2) INFORMATION FOR SEQ ID NO:]
L:24040 M:220 C: Keyword misspelled or invalid format, [(2) INFORMATION FOR SEQ ID NO:]
L:25020 M:332 E: (32) Invalid/Missing Amino Acid Numbering, SEQ ID:578
L:25020 M:203 E: No. of Seq. differs, LENGTH: Input:90 Found:91 SEQ:578
L:25517 M:333: Wrong sequence grouping, Amino acids not in groups!
L:25517 M:330 E: (2) Invalid Amino Acid Designator, 1
L:26015 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:26337 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:26504 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:29075 M:220 C: Keyword misspelled or invalid format, [(2) INFORMATION FOR SEQ ID NO:]
L:31178 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:31233 M:330 E: (2) Invalid Amino Acid Designator, 1
L:31928 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:32818 M:330 E: (2) Invalid Amino Acid Designator, 1
L:32833 M:330 E: (2) Invalid Amino Acid Designator, 1
L:32994 M:332 E: (32) Invalid/Missing Amino Acid Numbering, SEQ ID:727
M:332 Repeated in SeqNo=727
L:37498 M:212 E: (34) Invalid or duplicate Sequence ID Number, Value=[824:]
L:38853 M:330 E: (2) Invalid Amino Acid Designator, 1
L:38926 M:220 C: Keyword misspelled or invalid format, [(xi) SEQUENCE DESCRIPTION: SEQ ID NO:]
L:39078 M:330 E: (2) Invalid Amino Acid Designator, 1
L:40255 M:220 C: Keyword misspelled or invalid format, [(2) INFORMATION FOR SEQ ID NO:]
L:41817 M:220 C: Keyword misspelled or invalid format, [(D) TOPOLOGY:]
L:41817 M:246 W: Invalid value of Alpha Sequence Header Field, [TOPOLOGY:], SeqNo=906
L:43424 M:220 C: Keyword misspelled or invalid format, [(2) INFORMATION FOR SEQ ID NO:]
L:43513 M:220 C: Keyword misspelled or invalid format, [(2) INFORMATION FOR SEQ ID NO:]
L:43542 M:220 C: Keyword misspelled or invalid format, [(2) INFORMATION FOR SEQ ID NO:]
L:43594 M:203 E: No. of Seq. differs, LENGTH: Input:55 Found:194 SEQ:941
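
The "No. of Seq. differs" entries compare the declared LENGTH field with the number of residues actually found. A minimal sketch of that comparison follows, assuming residues are three-letter tokens and that line numbers, position markers, and `E-->`/`W-->` flags should be ignored. The names `count_residues` and `length_mismatch` are hypothetical, and this is not the validation program that generated the report.

```python
import re

# Three-letter residue tokens such as "Ala", "Xaa" or an invalid "Unk".
RESIDUE = re.compile(r"[A-Za-z]{3}")


def count_residues(rows):
    """Count three-letter tokens across the given sequence rows.

    Numeric tokens (line numbers, position markers) and flags such as
    "E-->" do not match the pattern and are skipped.
    """
    return sum(
        1
        for row in rows
        for token in row.split()
        if RESIDUE.fullmatch(token)
    )


def length_mismatch(declared, rows):
    """Return (declared, found) when the counts differ, else None."""
    found = count_residues(rows)
    return None if found == declared else (declared, found)
```

For example, passing the rows of one sequence together with its declared LENGTH would return a pair analogous to the `Input:` and `Found:` values in the summary whenever the two disagree.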